Abstract. 1. Determining large-scale distribution patterns for mosquitoes could advance
knowledge of global mosquito biogeography and inform decisions about where
mosquito inventory needs are greatest.

2.  Over  43  000  georeferenced  records  are  presented  of  identified  and  vouchered 
mosquitoes  from  collections  undertaken  between  1899  and  1982,  from  1853  locations 
in  42  countries  throughout  the  Neotropics.  Of  492  species  in  the  data  set,  23%  were  only 
recorded  from  one  location,  and  Anopheles  albimanus  Wiedemann  is  the  most  common 
species. 

3.  A  linear  log-log  species-area  relationship  was  found  for  mosquito  species  number 
and  country  area.  Chile  had  the  lowest  relative  density  of  species  and  Trinidad-Tobago 
the  highest,  followed  by  Panama  and  French  Guiana. 

4.  The  potential  distribution  of  species  was  predicted  using  an  Ecological  Niche 
Modelling  (ENM)  approach.  Anopheles  species  had  the  largest  predicted  species  ranges, 
whereas  species  of  Deinocerites  and  Wyeomyia  had  the  smallest. 

5.  Species  richness  was  estimated  for  1°  grids  and  by  summing  predicted  presence  of 
species  from  ENM.  These  methods  both  showed  areas  of  high  species  richness  in  French 
Guiana,  Panama,  Trinidad-Tobago,  and  Colombia.  Potential  hotspots  in  endemicity  included 
unsampled  areas  in  Panama,  French  Guiana,  Colombia,  Belize,  Venezuela,  and  Brazil. 

6. Argentina, The Bahamas, Bermuda, Bolivia, Cuba, and Peru were the most
under-represented countries in the database compared with known country species occurrence
data.  Analysis  of  species  accumulation  curves  suggested  patchiness  in  the  distribution 
of  data  points,  which  may  affect  estimates  of  species  richness. 

7.  The  data  set  is  a  first  step  towards  the  development  of  a  global-scale  repository  of 
georeferenced  mosquito  collection  records. 

Key words. bioclim, biogeography, collections, database, distribution, mosquito,
Neotropics,  species-area  relationship,  species  endemism,  species  richness. 


Introduction 

With the advent of powerful tools enabling the use of museum
specimen data for addressing questions in global biogeography,
climate change, vector control, and conservation, there is an
increasing need for continental and global scale databases of
insect specimen data (Turner et al., 2003; Elith et al., 2006).
Although continental scale databases have been compiled for
butterflies (e.g. Soberon et al., 2000; Kerr et al., 2001), insects
lag far behind plants and vertebrates in the availability of such
data.

For mosquitoes, georeferenced collections databases of large
geographic scale exist but are usually not digitised or comprise
unpublished survey and museum collection records. The paucity
of detailed data on the past and present distribution of vectors is
a major limiting factor for global modelling of vector-borne diseases
(Rogers & Randolph, 2003; Tatem et al., 2006). There is
also the challenge of choosing appropriate priority taxa that can be
used to represent biodiversity in monitoring projects (Pereira &
Cooper, 2006), and mosquitoes have been suggested as a priority
group within the insects (e.g. Raven, 1980).

Country inventories or checklists of mosquito species are
available, such as the Walter Reed Biosystematics Unit's
(WRBU) Systematic Catalog of the Culicidae (SCC)
(http://www.mosquitocatalog.org/main.asp). Recently, Geographical
Information Systems (GIS) software was used to display these
data as global maps to investigate geographic patterning of mosquito
species numbers (Foley et al., 2007). These authors
described species-area relationships and a latitudinal diversity
gradient for mosquito species. Higher resolution mosquito distribution
data on a global scale, however, are needed.

An extensive database of georeferenced mosquito collection
records for the Neotropical area is reported. The Neotropics have
been proposed as the centre of origin of mosquitoes (e.g.
Qu & Qian, 1989), including the Anophelinae (Harbach, 1998;
Krzywinski et al., 2001). A long history of anophelines in the
New World is suggested from the finding of Anopheles
(Nyssorhynchus) in Dominican amber 15-45 million years old
(Zavortink & Poinar, 2000). Belkin (1962) regarded the Caribbean
islands as a cradle of mosquito evolution, being climatically stable,
geologically unstable and especially affected by changing
sea levels. The long isolation of the New World, beginning with
the separation of South America and Africa about 100 million
years ago, appears to have encouraged the evolution of a unique
mosquito fauna. For example, four subgenera of Anopheles
(Kerteszia, Lophopodomyia, Nyssorhynchus, and Stethomyia)
are confined to the Neotropics. Foley et al. (2007) showed that
Brazil is a centre of diversity for mosquito species and French
Guiana has one of the highest relative species densities anywhere
in the world. These authors also showed that some Neotropical
countries have been a focus for mosquito systematics studies.

The database reported here began with the Mosquito Information
Management Project (MIMP), initiated in 1979 to develop
a computer-based system for storing and retrieving data
on mosquitoes (Faran et al., 1984). MIMP was tasked with digitising
the records of paper collection forms for vouchered specimens
housed at the National History Museum, Smithsonian
Institution. Notable among these sources was the Mosquitoes of
Middle America (MOMA) collection from the University of
California, Los Angeles. The MOMA records are a testament to
the foresight and taxonomic skills of John Belkin, who saw the
importance to future mosquito researchers of accessible collection
details in a standardised format (Belkin & Heinemann,
1973; Zavortink, 1990). Belkin & Heinemann (1973, 1975a,b,
1976a,b,c), Heinemann & Belkin (1977a,b,c, 1978a,b,c, 1979),
Heinemann et al. (1980), and Heinemann (1980) contain records
for the MOMA collections, some of which date back
to 1899. In the early 1980s, information pertaining to about
402 000 specimens on over 15 500 paper collection forms, including
details on the identification, collection location, and ecological
characteristics of the collection site, was entered into
a computer (Faran et al., 1984) utilising the SELGEM software
(Creighton & Crockett, 1971). Although many paper records remained
to be digitised, the MIMP project terminated in 1983, after
which these digital records languished. In 1999 the
Smithsonian Institution decided to convert all its electronic
specimen data, including the mosquito database, into a single
database. On 10 September 2001, the only digital copies of
these records arrived at a company a couple of blocks from the
World Trade Center in New York, where they were to be converted
from magnetic tape to a modern digital format. Despite
their proximity to ground zero, these records survived the terrorist
attacks the following day, and were returned to the
Smithsonian to be converted to a modern object-relational hybrid
data structure. An analysis of a subset of this database is
presented here to demonstrate the value of digitised georeferenced
collection data, in this case, for questions about mosquito
biogeography and survey history.

Materials  and  methods 

Faran et al. (1984) describe the composition of the 78 categories
and subcategories of information within the original MIMP database,
which included locality description, collection code, collection
date, latitude and longitude, species identification, a unique
record identification number, and collector.
The database was divided into those records that had geographic
coordinate data (i.e. degrees longitude and latitude) and those that
did not. Entries were further divided into those of questionable
taxonomy (i.e. species entries followed by 'group', 'complex' or
'aff.' = affinity), apparent identification failures (i.e. entries where
species identification was not given or was followed by 'uncertain'
or '?'), and those with unequivocal species identification. Those
without geographic coordinates were divided into those that had
Military Grid Reference System (MGRS) coordinates and those
that did not. MGRS coordinates were converted to geographic
coordinates for WGS-84 using the batch options in GEOTRANS
V2.2.6 (US Army Topographic Engineering Center, Geospatial
Information Division). The appropriate horizontal datum and
ellipsoid were determined by inspection of maps housed at the
WRBU that were originally used to arrive at MGRS coordinates.
Where MGRS data entry errors were suspected, such as transposed
letters and digits and incomplete coordinates, the Universal
Transverse Mercator (UTM) zone number and designator were
first confirmed by cross-checking against a world map of UTM
grids. Error detection at this stage was helped by use of the electronic
gazetteer eGAZ, and BioLink Map Assistant V2.1.309
(Shattuck, 1997), which located collection site names on a map.
Many MGRS readings were to 1 km precision but for a number
of Caribbean islands precision could be increased to 100 m by
re-georeferencing collection locations where these points were obvious
on original maps. When geographic coordinates were already
present in the database in degrees-minutes format, these were
converted to decimal degrees and checked to ensure they had the
correct sign (+ or -) for their hemisphere of origin.
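
The degrees-minutes conversion and hemisphere sign check described above can be illustrated with a minimal Python sketch. The function name, the hemisphere-letter argument and the example coordinates are assumptions made for illustration; they do not describe the tools actually used on the MIMP records.

```python
def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert unsigned degrees and decimal minutes to signed decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west become negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("unknown hemisphere: %r" % (hemisphere,))
    if degrees < 0 or not 0 <= minutes < 60:
        raise ValueError("degrees must be >= 0 and minutes in [0, 60)")
    value = degrees + minutes / 60.0
    # Latitudes cannot exceed 90 degrees, longitudes cannot exceed 180.
    limit = 90 if hemisphere in ("N", "S") else 180
    if value > limit:
        raise ValueError("coordinate out of range for hemisphere %s" % hemisphere)
    return -value if hemisphere in ("S", "W") else value


# Hypothetical Neotropical record: 9 deg 30 min N, 79 deg 45 min W.
lat = dm_to_decimal(9, 30, "N")
lon = dm_to_decimal(79, 45, "W")
print(lat, lon)
```

For the study area, a positive longitude from such a conversion would be an immediate flag that a hemisphere sign had been lost.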

Specimens with unequivocal identifications and geocodes were
filtered in Microsoft Excel for unique locations, and these point
data were converted to shape files for mapping in DIVA-GIS 5.3
(http://www.diva-gis.org/). Further data cleaning was undertaken
with the check coordinates option of DIVA-GIS, a point-in-polygon
method (Chapman, 2005a), which identifies points located outside
all polygons (i.e. points that fell in the ocean), and points that did not
match the recorded country name (i.e. points that fell in another country).
Locations so identified (n = 273) were rechecked and corrected
by consulting original collection cards and maps housed at the
WRBU or through the Alexandria Digital Library online Gazetteer
(http://middleware.alexandria.ucsb.edu/client/gaz/adl/index.jsp).
Data were imported into ArcView GIS 3.3 for graphical display.
Generic, subgeneric, and species names were updated to follow
the SCC on the WRBU website
(http://www.mosquitocatalog.org/main.asp, accessed 23 May 2006).
Mosquito species composition
by country was obtained from the WRBU website (accessed 16
June 2006). Land area of countries was determined via the Central
Intelligence Agency (CIA) factbook website
(http://www.cia.gov/cia/publications/factbook/geos/mh.html, accessed 19 June 2006).
Species number and country area data were log10 transformed and
linear regression performed using MINITAB version 14.20 (Minitab,
State College, PA, U.S.A.).
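
As an illustration of the log10 transformation and linear regression step, the following Python sketch fits an ordinary least-squares line to log-transformed area and species counts. It is a generic implementation under stated assumptions, not the MINITAB procedure itself, and the input values are invented purely to exercise the function.

```python
import math


def loglog_fit(areas, species_counts):
    """Least-squares fit of log10(species) on log10(area).

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in species_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def expected_species(area, slope, intercept):
    """Back-transform the fitted line to a species number for a given area."""
    return 10 ** (intercept + slope * math.log10(area))


# Invented areas (square km) and species counts, for illustration only.
areas = [100.0, 1000.0, 10000.0, 100000.0, 1000000.0]
counts = [6, 12, 30, 55, 140]
slope, intercept, r2 = loglog_fit(areas, counts)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```

The `expected_species` helper shows how a fitted line can be read back as an expected species number for a country of a given land area.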

The MIMP database is composed of mosquito collections,
some of which were undertaken from the same location at different
dates. Each collection records one or more individuals of
the same species, but the number of specimens during a single
collection was not meant to be an estimate of the abundance of
that species at that location. It was decided to act conservatively
by reducing the cleaned subset of the MIMP database to record
only the presence/absence (occurrence or incidence) of each
species in each collection.
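
The reduction from specimen entries to incidence records can be sketched as a simple de-duplication. The tuple layout (location, collection, species) and the example identifiers below are hypothetical and are not the MIMP field names.

```python
def unique_species_collections(entries):
    """Collapse specimen entries to one record per species per collection.

    entries: iterable of (location_id, collection_id, species) tuples,
    one tuple per specimen entry.
    """
    return {(collection, species) for _location, collection, species in entries}


def unique_species_locations(entries):
    """Collapse further to one record per species per location."""
    return {(location, species) for location, _collection, species in entries}


# Hypothetical entries: two specimens of one species in a single collection,
# and a later collection at the same location.
entries = [
    ("LOC1", "C001", "Anopheles albimanus"),
    ("LOC1", "C001", "Anopheles albimanus"),
    ("LOC1", "C002", "Anopheles albimanus"),
    ("LOC2", "C003", "Aedeomyia squamipennis"),
]
print(len(unique_species_collections(entries)))
print(len(unique_species_locations(entries)))
```

Specimen counts are deliberately discarded here, in line with the decision not to treat them as abundance estimates.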

Distribution  modelling 

The potential distribution of species was predicted using the
BIOCLIM algorithm (Nix, 1986), an Ecological Niche Modelling
(ENM) approach based on climate matching, in DIVA-GIS.
BIOCLIM attempts to identify suitable and unsuitable areas or
niches in which the organism is likely to occur, or could survive
if it were introduced, based on the climatic features of the data
point locations. The BIOCLIM model was implemented using the
WorldClim 2.5 arc minute resolution database of 19 bioclimatic
variables, that is, annual mean temperature, mean monthly
temperature range, isothermality, temperature seasonality, maximum
temperature of the warmest month, minimum temperature
of the coldest month, annual temperature range, mean
temperature of the wettest quarter, mean temperature of the driest
quarter, mean temperature of the coldest quarter, mean temperature
of the warmest quarter, annual precipitation, wettest
month precipitation, driest month precipitation, precipitation
seasonality, wettest quarter mean precipitation, driest quarter
mean precipitation, coldest quarter mean precipitation, and warmest
quarter mean precipitation. All areas that are within the climate
envelope described by the data, cut off beyond the 0.025 percentile,
were mapped as true (1) or false (0). The batch option was used
to run BIOCLIM for each species.
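
The general idea of a climate envelope with a percentile cut-off can be sketched as follows. This is a simplified illustration that assumes the envelope for each variable spans the 0.025 to 0.975 quantiles of the values at collection points and that a cell is scored 1 only when every variable lies inside its interval; the DIVA-GIS implementation may differ in detail, and the variable names and values are invented.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    n = len(sorted_values)
    if n == 1:
        return sorted_values[0]
    pos = q * (n - 1)
    lower = int(math.floor(pos))
    upper = min(lower + 1, n - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def fit_envelope(presence_climate, cut=0.025):
    """Build per-variable (low, high) limits from climate values at presence points.

    presence_climate: list of dicts mapping variable name -> value.
    """
    envelope = {}
    for variable in presence_climate[0]:
        values = sorted(point[variable] for point in presence_climate)
        envelope[variable] = (quantile(values, cut), quantile(values, 1.0 - cut))
    return envelope


def predict_presence(envelope, cell_climate):
    """Return 1 if every variable of the cell lies inside the envelope, else 0."""
    inside = all(low <= cell_climate[var] <= high
                 for var, (low, high) in envelope.items())
    return 1 if inside else 0


# Invented climate values at eleven collection points of one species.
points = [{"annual_mean_temp": 20.0 + i, "annual_precip": 1000.0 + 100.0 * i}
          for i in range(11)]
envelope = fit_envelope(points)
print(predict_presence(envelope, {"annual_mean_temp": 25.0, "annual_precip": 1500.0}))
print(predict_presence(envelope, {"annual_mean_temp": 35.0, "annual_precip": 1500.0}))
```

With very few collection points the envelope collapses towards a single climate value, which is one way a model can end up predicting almost no suitable grid cells.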

Species  richness 

The number of different species (i.e. species richness) and the
number of observations were calculated from point data and
displayed as maps in DIVA-GIS for 1° grid cells. Potential species
richness was also estimated by summing the incidence (presence/absence)
of each species for each grid cell as determined
by the BIOCLIM distribution model. This methodology is similar
to that of Jarvis et al. (2003), who used principal components
analysis applied to climate variables.
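
Observed richness per 1° grid cell amounts to counting distinct species among the points that fall in each cell. A minimal sketch, assuming points are given as (longitude, latitude, species) in decimal degrees and cells are indexed by the floor of each coordinate (an indexing choice made here for illustration):

```python
import math
from collections import defaultdict


def richness_by_cell(points, cell_size=1.0):
    """Count distinct species and total observations per grid cell.

    points: iterable of (longitude, latitude, species).
    Returns two dicts keyed by (column, row) cell index.
    """
    species_in_cell = defaultdict(set)
    observations = defaultdict(int)
    for lon, lat, species in points:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        species_in_cell[cell].add(species)
        observations[cell] += 1
    richness = {cell: len(names) for cell, names in species_in_cell.items()}
    return richness, dict(observations)


# Invented points: three observations in one cell, one in another.
points = [
    (-79.5, 9.2, "species A"),
    (-79.1, 9.9, "species B"),
    (-79.5, 9.2, "species A"),
    (-61.3, 10.5, "species A"),
]
richness, observations = richness_by_cell(points)
print(richness)
print(observations)
```

Returning the observation count alongside richness makes it easy to see how strongly the two are related in any given cell.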

Endemism 

A weighted endemism algorithm based on the BIOCLIM potential
species distribution maps was used. This is a variant of the
methodology used in the Biodiversity Analysis Tool (BAT)
(http://www.deh.gov.au/biodiversity/abif/bat/index.html, accessed
19 June 2006). Whereas in BAT the range size of each
species is estimated as the number of grid cells occupied by
records for that species in the data, the present study used predicted
distribution data from the output of ENM. First, the range
size of each species was estimated from the number of positive
2.5 min grid cells from the distribution modelling. An endemism
score for each species grid file was calculated in DIVA-GIS
as the inverse of the range size multiplied by 100. Grid files for
species so calculated were then summed. Species whose predicted
distribution was greater than 100 000 grid cells were not
included in the calculation because they would add only a negligible
amount to the final score. Also, species with predicted distribution
of one grid cell or less were excluded, as it could not
be demonstrated that the modelling in these cases yielded a
valid prediction. For ease of visualising high value grid cells,
the summed endemism grid values were recalculated according
to the maximum values of neighbouring grid cells (9 × 9) with
the neighbourhood function of DIVA-GIS.
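
The weighted endemism score can be sketched directly from this description: each retained species contributes 100 divided by its range size to every cell it is predicted to occupy, and the contributions are summed per cell. The data structure (a dict of species to sets of cell identifiers) is an assumption for illustration, and the 9 × 9 neighbourhood maximum used for display is not reproduced here.

```python
def weighted_endemism(predicted_ranges, min_cells=2, max_cells=100000):
    """Sum per-species endemism weights over grid cells.

    predicted_ranges: dict mapping species -> set of cell identifiers where
    the model predicts presence. Species with fewer than min_cells or more
    than max_cells predicted cells are skipped, following the text.
    """
    scores = {}
    for species, cells in predicted_ranges.items():
        range_size = len(cells)
        if range_size < min_cells or range_size > max_cells:
            continue
        weight = 100.0 / range_size  # inverse of range size, multiplied by 100
        for cell in cells:
            scores[cell] = scores.get(cell, 0.0) + weight
    return scores


# Invented predicted ranges for three species.
ranges = {
    "narrow species": {1, 2},
    "wider species": {2, 3, 4, 5},
    "single-cell species": {9},
}
print(weighted_endemism(ranges))
```

Cells shared by several narrow-ranged species accumulate the highest scores, which is what marks them as potential endemism hotspots.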

Species  accumulation  curves 

A square was defined, in DIVA-GIS, with lower left and upper
right corners at longitude/latitude -118°/-40° and -31°/35°,
respectively. The projection was equal-area cylindrical, ensuring
that grid cells are comparable with each other. This area
represented 87 rows × 75 columns of 1° × 1° grid cells
(total = 6525 cells). One degree of either latitude or longitude at
the equator is approximately 111 km. Of the 6525 grid cells in
the area, 27% (1830) occurred over land and 325 of these (18%)
had at least one species observation.

The program EstimateS (version 7.5.1, Colwell, 2005) was
used to investigate under-sampling and spatial aggregation in
the data. EstimateS calculates randomised species accumulation
curves (also known as sample-based rarefaction curves)
and computes a variety of species richness indicators. For an
idealised complete inventory for an area the species accumulation
curve climbs asymptotically to the true species richness
and taxa that are rare have been observed more than once. The
expected richness function in EstimateS is called Sobs (Mao
Tau). A Coleman curve is calculated by randomly reassigning
specimens to samples and then recalculating the species accumulation
curve, thus removing any clumping in the data. The
present study used the incidence-based Chao2 estimator and
incidence-based coverage estimator ICE, which depends on the
presence and distribution of rare taxa, to estimate the lower
bounds of species richness and to assess the degree of under-sampling.
The present study used the default values in
EstimateS, that is, 50 randomisations for estimators and 10 for
the upper abundance limit for rare taxa. Input files for
EstimateS were obtained by using the point-to-grid function
for species richness in DIVA-GIS, which produces a grid file of
the presence (1) or absence (0) of each species in each grid cell.
The parameter field was set for all species, thereby generating
492 separate grid files. These grid files were stacked together
in DIVA and exported as a matrix in text file format. Only cells
that had at least one observation were used (n = 325) and these
records were reformatted for input into EstimateS.
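
To make the incidence-based estimation concrete, the sketch below computes Chao2 from a species-by-cell presence/absence matrix using the commonly cited classic and bias-corrected forms, where Q1 and Q2 are the numbers of species found in exactly one and exactly two samples and m is the number of samples. It is an illustration under those assumptions and is not claimed to reproduce EstimateS output, which also involves randomisation of sample order; the matrix is invented.

```python
def incidence_counts(matrix):
    """Return (S_obs, Q1, Q2) for a species-by-sample 0/1 matrix.

    matrix: list of rows, one row per species, one column per sample (grid cell).
    """
    frequencies = [sum(1 for value in row if value) for row in matrix]
    s_obs = sum(1 for f in frequencies if f > 0)
    q1 = sum(1 for f in frequencies if f == 1)  # uniques
    q2 = sum(1 for f in frequencies if f == 2)  # duplicates
    return s_obs, q1, q2


def chao2_classic(matrix):
    """Classic form: S_obs + Q1^2 / (2 * Q2); undefined when Q2 is zero."""
    s_obs, q1, q2 = incidence_counts(matrix)
    if q2 == 0:
        raise ValueError("classic Chao2 is undefined when no species occurs twice")
    return s_obs + (q1 * q1) / (2.0 * q2)


def chao2_bias_corrected(matrix):
    """Bias-corrected form: S_obs + ((m - 1) / m) * Q1 * (Q1 - 1) / (2 * (Q2 + 1))."""
    m = len(matrix[0])
    s_obs, q1, q2 = incidence_counts(matrix)
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2.0 * (q2 + 1))


# Invented matrix: four species across four grid cells.
matrix = [
    [1, 0, 0, 0],  # found in one cell
    [0, 1, 0, 0],  # found in one cell
    [1, 1, 0, 0],  # found in two cells
    [1, 1, 1, 0],  # found in three cells
]
print(chao2_classic(matrix), chao2_bias_corrected(matrix))
```

Both forms depend only on the rarest species, which is why a data set with many singletons yields estimates well above the observed species count.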

Results 

After data cleaning the original 87 637 specimen entries, 43 738
records, comprising 7069 collections, were suitable for inclusion
in this analysis. Most (61.3%) records were for specimens
collected in the 1960s, followed by the 1970s (34.0%), and the
most recent record was for 1982. Only 3.7% of records were
from before the 1960s, with the oldest being for Anopheles
aquasalis Curry by F. Urich in Trinidad in 1899. Not included
are 7155 records from Egypt, Israel and Kenya, 27 692 of uncertain
taxonomy, 153 miscellaneous, and 8899 records that were
not georeferenced. The latter category was mostly from Panama
(2864) followed by Brazil (1627), Colombia (937), Ecuador
(745), Mexico (429), Nicaragua (414), Trinidad-Tobago (361),
and Venezuela (238). When repeat specimen collections (i.e.
multiple records of a species during a single collection) of the
cleaned data were combined, this resulted in 12 505 unique
species-collection records. When multiple collections at the
same location on different dates were combined, this resulted in
6773 unique species-location records. These records comprised
1853 locations from 42 countries in the Neotropical region
(Fig. 1). Points that originally fell outside country polygons
were most noticeable for islands and border areas but these were
often due to the combined error of the input coordinates and the
inaccuracy of country polygons. The database is available at
http://www.mosquitomap.org and in future will be available
with additional fields, through the Smithsonian Institution's
collections online database.

There are 492 species listed and 111 (23%) of these were singletons,
being only recorded from one location (Fig. 1). Highest
density of observations per 1° grid cell was recorded for the
Canal zone in Panama, northern Venezuela, and French Guiana.
Figure 2 shows species ranked according to the number of
locations where they were collected. Anopheles albimanus
Wiedemann appears at the most sites. Table 1 shows the number
of locations and the species number per country. The numbers
of species recorded for countries according to the SCC are included
in Table 1. In a number of cases the present data set has
more species than were recorded for the SCC, although the SCC
is regularly updated and may change to eliminate this discrepancy.
Compared with the species number recorded for countries
in the SCC, the MIMP database under-samples Argentina, The
Bahamas, Bermuda, Bolivia, Cuba, and Peru, in particular.

Fig. 1. Collection locations (n = 1853) in
the MIMP database. Larger circles indicate
location of species collected at only one location.
Note that many record locations are
not visible due to the scale.

Fig. 2. Species in the MIMP database ranked according to the number
of locations where they were collected.

Figure 3 shows the log-log species-area relationship using
the number of species per country in the SCC (or the MIMP database,
whichever is the higher number according to Table 1).
A trend of increasing species number with increasing land area
was found, as was shown by Foley et al. (2007) for all countries.
A linear regression of log area (x) against log species
number (y) for each country showed a highly significant positive
relationship (y = 0.3547x + 0.1064, R² = 0.6685, F(1,40) = 80.67,
P < 0.0001, residual mean-square error = 0.133, SE of
intercept = 0.1784; SE of slope = 0.03949). This indicated that
66.85% of the variation in species richness is explained by area.
Islands in the study area generally have smaller land area and
fewer species compared with mainland countries. For mainland
countries, Chile appears to have the fewest species for its
land area. By contrast, the island nation of Trinidad-Tobago has
a relatively species-rich mosquito fauna for its land area.
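
As a consistency check on the reported regression statistics, the F statistic of a simple linear regression can be recovered from the coefficient of determination. The working below assumes one data point per country (n = 42, so 40 residual degrees of freedom) and uses the reported value of 0.6685, corresponding to the 66.85% of variation explained by area.

```latex
F_{1,\,n-2} = \frac{R^{2}\,(n-2)}{1-R^{2}}
            = \frac{0.6685 \times 40}{1 - 0.6685}
            = \frac{26.74}{0.3315}
            \approx 80.7
```

This agrees, within rounding, with the reported F value of 80.67, and also with the square of the slope divided by its standard error, (0.3547 / 0.03949)² ≈ 80.7.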

The results of the species distribution modelling are shown in
terms of predicted grid-cell area in Fig. 4. Approximately 200
species had a predicted species range of only one grid cell or
less, and it is likely that more sample points are needed for these
species to adequately model their distribution. Aedeomyia
squamipennis (Lynch Arribalzaga) had the largest predicted species
range. Generally Anopheles species had the largest species
ranges (Fig. 5), whereas species of Deinocerites and Wyeomyia
had the smallest predicted species range.

Figure 6 shows number of species for each 1-degree grid
cell that contained observations. The most species rich grid
cells often contained the greatest number of observations (data
not shown). Figure 7 shows the result of summing the potential
species ranges of all species. Greatest species richness occurred
in Trinidad-Tobago, French Guiana, Colombia, and
Brazil. Greatest overlap of these two methods occurred on the
border of Costa Rica and Nicaragua and in French Guiana.

Figure  8  shows  potential  endemicity  including  a  close-up 
centred  on  Panama.  Only  models  that  resulted  in  two  or  more 
grid  cells  were  included,  which  could  preferentially  eliminate 
species  that  were  recorded  from  only  one  location.  Hotspots 
may  be  influenced  by  collecting  effort  but  Fig.  8  shows  that 
hotspots  often  occur  in  unsampled  areas. 

The curves resulting from the EstimateS analysis are
given in Fig. 9. EstimateS advised that the coefficient of
variation was >0.5, and recommended re-computing Chao2
using the Classic instead of the Bias-Corrected option and reporting
the larger of Chao2 and ICE (which in this case was
ICE) as the best estimate for incidence-based richness. By the
end of the samples, the ICE estimator was only 3% higher
than the Chao2 value but was 34% higher than the Sobs curve.
According to Heyer et al. (1999), for a complete inventory the
estimators and Sobs coincide and asymptote together, whereas
for a relatively under-sampled taxon the estimator curves are
much higher (e.g. 65%) than the observed curves. In the most
under-sampled taxa, the Sobs curve may also be linear (Heyer
et al., 1999), but this was not observed in the present study. A
discrepancy between Coleman and Sobs curves in Fig. 9 is
evidence of patchiness in the distribution of data points, especially
for rare species.

Discussion 

The present database is the largest digitised collection of georeferenced
species occurrence records for vouchered Neotropical
mosquitoes. Analyses of species richness were undertaken to
demonstrate the utility of these data to answer fundamental
questions about mosquito biogeography, and to encourage
others to make available their georeferenced mosquito collection
records. A worldwide database of georeferenced mosquito
collection records would enable new insights into global patterns
of mosquito biodiversity and survey history.

Jablonski et al. (2006) identified origination rate, extinction
rate, and migration as the fundamental determinants of spatial
patterns of biodiversity. The species-area relationship for a
country is expected to primarily reflect the sampling effort, evolutionary
history, and the intrinsic ability of the land to support
different mosquito species. Variation in the productivity, heterogeneity
and stability of the environment may also be important
for species richness. A latitudinal gradient of increasing biodiversity
from the poles to the equator has been noted for many
organisms (see Jablonski et al., 2006), including mosquito species
(Foley et al., 2007). The underlying mechanism for this
phenomenon may vary depending on the organism. Allen and
Gillooly (2006) showed that for fossil ocean plankton, species
richness and speciation rates both peak near the equator even
after controlling for sampling effort and habitat area. Jablonski
et al. (2006) showed that for genera and subgenera of marine bivalves,
taxa have preferentially originated in the tropics and expanded
toward the poles without losing their tropical presence.
Weir and Schluter (2007), however, found that for birds and
mammals faster speciation at higher latitudes contributes to the
latitudinal diversity gradient.

The  latitudinal  biodiversity  gradient,  spatial  scale,  and  the 
species-time  relationship  interact  with  the  species-area  rela¬ 
tionship  (e.g.  Turner  &  Tjprve,  2005;  Drakare  et  al.,  2006; 
White  et  al.,  2006)  but  despite  this  complication,  two  thirds  of 
the  variation  in  mosquito  species  richness  for  countries  in  this 
study  is  explained  by  area.  Foley  et  al.  (2007)  showed  a  simi¬ 
lar  species-area  relationship  using  worldwide  country  species 
data.  A  pattern  of  increasing  species  numbers  with  area  was  pro¬ 
moted  by  Mac  Arthur  and  Wilson  (1967)  through  their  theory  of 
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Table  1 .  Number  of  locations  and  mosquito  species  for  countries  in  the  Mosquito  Information  Management  Project  database  compared  with  the  number 
of  species  recorded  in  the  SCC  (accessed  17  June  2006).  Where  species  number  in  mimp  is  for  one  island  of  a  country  the  number  is  preceded  by  >.  Area 
is  in  square  km. 


Country 

                          SCC   MIMP  % species  No. locations   Land area  Locations/area
Anguilla                    -      5      >100*             10         102         0.0980‡
Antigua and Barbuda         7    >16      >100*             38       442.6         0.0859‡
Argentina                 183     37      20.2†             20   2 736 690          0.0000
The Bahamas                19      4      21.1†              7      10 070          0.0007
Barbados                    5      8     160.0*             37         431         0.0858‡
Belize                     91     49       53.8             30      22 806          0.0013
Bermuda                     1      -       0.0†              0        53.3          0.0000
Bolivia                   157      9       5.7†             11   1 084 390          0.0000
Brazil                    447    141       31.5            135   8 456 510          0.0000
Cayman Is.                 12     20     166.7*             15         262         0.0573‡
Chile                      13      5       38.5              9     748 800          0.0000
Colombia                  251    147       58.6             94   1 038 700          0.0001
Costa Rica                154    128       83.1            215      50 660          0.0042
Cuba                       48      1       2.1†              7     110 860          0.0001
Dominica                    9     19     211.1*             60         754         0.0796‡
Dominican Republic         41     40       97.6             76      48 380          0.0016
Ecuador                   118     69       58.5             65     276 840          0.0002
El Salvador                69     22       31.9             15      20 720          0.0007
French Guiana             224    123       54.9             62      89 150          0.0007
Grenada                    17     22     129.4*             39         344         0.1134‡
Guadeloupe                 14     14      100.0             26        1706         0.0152‡
Guatemala                 105     67       63.8             69     108 430          0.0006
Guyana                     84     47       56.0             16     196 850          0.0001
Haiti                      27     17       63.0             15      27 560          0.0005
Honduras                   82     47       57.3             24     111 890          0.0002
Jamaica                    57     28       49.1             51      10 831          0.0047
Martinique                  6      6      100.0              5        1060          0.0047
Mexico                    211    108       51.2            107   1 923 040          0.0001
Montserrat                  5     18     360.0*             40         102         0.3922‡
Saint Kitts and Nevis       3    >10    >100.0*             19         261          0.0728
Nicaragua                  81     57       70.4             39     120 254          0.0003
Panama                    264    172       65.2            214      75 990          0.0028
Paraguay                   63     29       46.0             20     397 300          0.0001
Peru                      130     18      13.8†             17   1 280 000          0.0000
Puerto Rico                35     15       42.9             31        8870          0.0035
Saint Lucia                13      7       53.8             13         606         0.0215‡
St Vincent/Grenadines       -     >1    >100.0*              1         389          0.0026
Suriname                  161     52       32.3             20     161 470          0.0001
Trinidad and Tobago       125     56       44.8             70        5128         0.0137‡
Uruguay                    54      -          -              0     173 620          0.0000
Venezuela                 238     98       41.2             95     882 050          0.0001
Virgin Islands             14     13       92.9             17         346         0.0491‡

*Country records where the SCC could be updated.
†Where the present database is least representative of known species.
‡Where the density of collection sites is high.
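
The two derived columns of the table appear to follow directly from the others: % species is the MIMP species count expressed as a percentage of the SCC count, and Locations/area is the number of locations divided by land area. A minimal Python sketch of that arithmetic, using three rows copied from the table (the function names are illustrative and not part of the original study):

```python
def percent_species(mimp_species, scc_species):
    """MIMP species count as a percentage of the SCC count (1 decimal place)."""
    return round(100.0 * mimp_species / scc_species, 1)


def location_density(n_locations, land_area):
    """Collection locations per unit of land area (4 decimal places)."""
    return round(n_locations / land_area, 4)


# (country, SCC species, MIMP species, no. locations, land area), from the table
rows = [
    ("Belize", 91, 49, 30, 22806),
    ("Costa Rica", 154, 128, 215, 50660),
    ("Panama", 264, 172, 214, 75990),
]

for country, scc, mimp, n_loc, area in rows:
    print(country, percent_species(mimp, scc), location_density(n_loc, area))
```

Values above 100% arise wherever the MIMP count exceeds the SCC count, which is how the table flags country records where the SCC could be updated.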


island biogeography. Chile was shown to have relatively few
species for its area, possibly due in part to its higher latitude
compared with the other countries in the analysis. In contrast,
Trinidad-Tobago appears to have many species for its area.

Inspection of Fig. 3 suggests that the expected species number
for an area the size of Trinidad-Tobago is about 30, compared
with the 122 species recorded. Foley et al. (2007) demonstrated
that island countries have higher endemicity and are possibly
better sampled for mosquitoes than are mainland countries.
Trinidad-Tobago is species rich, however, even compared with
other island nations. The species richness of Trinidad-Tobago
has also been shown for bats and passerine birds (Koopman,
1958). These islands are considered continental, and would
have been joined to the mainland during the Pleistocene
(Koopman, 1958). Only 33 species, however, are common to both
Trinidad-Tobago and Venezuela (the closest continental area),
including five of seven Wyeomyia, three of eight Anopheles, six
of 13 Culex, and five of seven Aedes. Trinidad showed a high
potential species richness and endemism based on ENM, supporting
the idea of a generalised environmental suitability for species
generation and/or survival on that island.

Fig. 3. Mosquito log-log species-area relationship for 42 countries
covered by the MIMP database. Species numbers were obtained from the
SCC (or the MIMP database, whichever was higher). A linear regression
line of best fit is shown.
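
The log-log species-area relationship of Fig. 3 can be illustrated with an ordinary least-squares fit of log10(S) = log10(c) + z log10(A). The sketch below is an assumption-laden illustration rather than the authors' analysis: it uses only a subset of rows from Table 1, so its fitted coefficients will differ from the regression line in Fig. 3, and the function names are hypothetical. Species number is taken as the higher of the SCC and MIMP counts, as described for Fig. 3.

```python
import math

# (country, SCC species, MIMP species, land area): a subset of Table 1
ROWS = [
    ("Brazil", 447, 141, 8456510),
    ("Colombia", 251, 147, 1038700),
    ("Mexico", 211, 108, 1923040),
    ("Panama", 264, 172, 75990),
    ("Costa Rica", 154, 128, 50660),
    ("Jamaica", 57, 28, 10831),
    ("Puerto Rico", 35, 15, 8870),
    ("Barbados", 5, 8, 431),
    ("Grenada", 17, 22, 344),
    ("Chile", 13, 5, 748800),
    ("Trinidad and Tobago", 125, 56, 5128),
]


def fit_species_area(rows):
    """Least-squares fit of log10(S) = log_c + z * log10(A)."""
    xs = [math.log10(area) for _, _, _, area in rows]
    ys = [math.log10(max(scc, mimp)) for _, scc, mimp, _ in rows]
    n = len(rows)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx                 # slope of the log-log line
    log_c = mean_y - z * mean_x   # intercept
    return log_c, z


def expected_species(area, log_c, z):
    """Species number predicted by the fitted line for a given area."""
    return 10 ** (log_c + z * math.log10(area))


log_c, z = fit_species_area(ROWS)
for country, scc, mimp, area in ROWS:
    observed = max(scc, mimp)
    predicted = expected_species(area, log_c, z)
    print(f"{country}: observed {observed}, expected {predicted:.0f}")
```

Countries whose observed count lies well above or below the fitted value are the ones that stand out as relatively species rich or species poor for their area.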

Potential hotspots in species endemism and richness could
provide a guide to the most productive locations for future
mosquito biodiversity surveys. Some Neotropical areas are known
hotspots in diversity and endemism for plants and vertebrates,
namely Brazil's Atlantic Forest, the Caribbean islands, the
Tropical Andes, and Mesoamerica (Myers et al., 2000). Belkin et al.
(1965) regarded the mosquito fauna of an area they described
as Middle America to be the most endemic in the world.
Comparison of maps of potential mosquito species richness and
endemism generated from ENM (Figs 7 and 8) revealed much
congruence. Using birds as an example, however, Orme et al.
(2005) found that areas of species richness and endemism are
not usually congruent.

Fig. 4. Log potential species range (number of 2.5 min grids)
according to BIOCLIM ecological niche modelling for mosquito species
in the MIMP database. Only predicted species ranges of two to
100 000 grid cells were used in the analysis of endemicity (Fig. 8).

Compared with the arthropod, vertebrate, and plant species
analysed in Heyer et al. (1999), the mosquito curves in Fig. 9
suggest an inventory that is intermediate in completeness.
Values for Sobs and ICE were still rising for the mosquito data
rather than reaching an asymptote, and the uniques curve is
level rather than declining. For a complete inventory, values
for uniques tend toward zero, as all species will have been
observed multiple times. It is possible that the rarity of many
mosquito species is an artefact, perhaps of non-random sampling,
which distorts the results. This possibility and the patchiness in
the distribution of data points, especially for rare species,
suggest that care should be exercised in the interpretation of
species richness estimates from the mosquito data.
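
The quantities tracked in Fig. 9 can be illustrated with a small incidence data set. The following Python sketch is not the EstimateS implementation: it counts Sobs, uniques, and duplicates from hypothetical grid-cell species lists and builds an accumulation curve for a single fixed ordering of cells, with no randomisation and no ICE or Coleman calculation. All names and data are illustrative assumptions.

```python
from collections import Counter


def incidence_summary(cells):
    """Return (Sobs, uniques, duplicates) for a list of per-cell species sets."""
    counts = Counter(species for cell in cells for species in cell)
    s_obs = len(counts)                                   # species observed at all
    uniques = sum(1 for c in counts.values() if c == 1)   # found in exactly one cell
    duplicates = sum(1 for c in counts.values() if c == 2)  # found in exactly two cells
    return s_obs, uniques, duplicates


def accumulation_curve(cells):
    """Cumulative number of species as cells are added in the given order."""
    seen = set()
    curve = []
    for cell in cells:
        seen |= cell
        curve.append(len(seen))
    return curve


# Hypothetical toy data: one set of species names per grid cell
cells = [
    {"sp_a", "sp_b"},
    {"sp_a", "sp_c"},
    {"sp_a", "sp_b", "sp_d"},
    {"sp_a", "sp_e"},
]

print(incidence_summary(cells))
print(accumulation_curve(cells))
```

In this toy example the curve is still climbing at the last cell and most species are uniques, which is the pattern the text associates with an incomplete inventory.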

One of the few studies of mosquito species richness was
conducted by Montes (2005) in the Serra do Cantareira State
Park of Brazil. That author found that the forest environment
had the highest species richness and the peri-domestic
environment had the most dominant species (Culex vaxus Dyar). Loss
of species and an increase in dominance are a typical outcome
of environmental degradation (Magurran, 2004). Lande et al.
(2000) noted that species accumulation curves might be
unreliable when there is a mixture of assemblages that differ in
species richness and evenness. Magurran (2004) noted that
samples taken from an assemblage where one species dominates
and the others are rare would tend to under-estimate richness.
The most commonly encountered species in the present
database was Anopheles albimanus and 23% of species were
recorded from just one location, suggesting that species
evenness was low. Researchers should be cognizant of the impact of
humans on the diversity and composition of mosquito species
and the potential effect of development on species richness
estimators.

In addition, climate change severely impacts the species
ranges of many organisms and is an important force for change,
now and in the future (Parmesan & Yohe, 2003). It is evident that
anthropogenic changes in habitats and climate affect
invertebrate diversity (Thomas et al., 2004) and community processes
such as insect pollination (Biesmeijer et al., 2006). Changes in
land cover and/or climate may have important effects on the
distribution and intensity of mosquito-borne diseases on regional
scales (e.g. Martens et al., 1999; Lindsay & Bayoh, 2004; Zhou
et al., 2004; Munga et al., 2006). Rubio-Palis and Zimmerman
(1997) concluded that irrigation, clearing of tropical forest for
subsistence agriculture, animal husbandry and mineral
exploitation, and construction of dams complicate an ecoregional
approach to classifying vector distribution in the Neotropics. It is
likely that the distributions of non-vector species of mosquito
are also affected by the activities of a growing human population
in the Neotropical region. Databases of mosquito collection data
may assist in understanding the scope and intensity of
human-mediated environmental changes on a regional or global scale.

A number of assumptions and limitations are inherent in the
present study. For instance, it was assumed that sampling is
uniform and species-blind, that is, a complete species
inventory. Hijmans et al. (2000) identified four types of bias that
could apply in the present case, namely: species bias (e.g.
over-sampling species of Anopheles due to greater abundance or in
connection to malaria studies); species-area bias (e.g.
over-sampling island endemics compared with mainland species);
hotspot bias (e.g. over-sampling areas where previous studies
indicated a high species richness); and infrastructure bias (e.g.
over-sampling near roads and towns). An additional source of
bias is reporting bias (i.e. not reporting unsuccessful attempts
to collect mosquitoes). Sampling bias will favour certain
species, so the database should be viewed as a record of species
presence rather than absence.

Fig. 5. Mean log potential species range (2.5 min grids) + 1 SD
according to BIOCLIM ecological niche modelling in DIVA-GIS for
mosquito genera in the MIMP database. Only data for genera with
five or more species are shown.

It was beyond the scope of this study to evaluate the
ecological realism of distribution models. Estimates of species ranges
from ENM did not factor in historical effects on species
dispersal and survival, and may therefore overestimate some
current species ranges. Species whose predicted distribution
covered a minimum of two grid cells were included in the
estimates of species richness and endemism, but additional
sampling will probably increase predicted range, thereby affecting
these estimates. A better assessment of potential distribution
will require greater sampling, additional environmental layers,
and a statistical treatment of the reliability of distribution
models, as is provided by the modelling procedure Genetic
Algorithm for Rule Set Prediction (Stockwell & Noble, 1992),
available in the desktop GARP software.

Fig. 6. Observed species richness for 1-degree grids calculated in
DIVA-GIS from 1853 locations.

Fig. 7. Map of species richness created by summing the potential
distribution (BIOCLIM) of species from the MIMP database.

Fig. 8. Predicted hotspots in endemicity based on the MIMP database.
Hotspots are shown by darker colour grids derived by summing grids
of species potential range (n = 255), weighted so that species
predicted to have relatively restricted range were given greater
weight than species predicted to be more widespread. For display
purposes, hotspots were accentuated by calculating a 9 × 9
neighbourhood size based on the maximum grid value. Inset shows
collection locations on the endemicity map centred over Panama to
show that potential hotspots in endemicity can occur beyond
currently sampled areas.

Fig. 9. Species richness estimators and patchiness indicators for
mosquito species from the Mosquito Information Management Project
database calculated with the program EstimateS. Sobs = empirical
species accumulation curve; ICE = coverage estimator of Chao and
Lee (1992); Cole Rarefaction = Coleman curve, a patchiness
indicator, of Coleman (1981); uniques = number of species occurring
once in just one grid cell; duplicates = number of species
occurring in just two grid cells.
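
The grid summation behind the richness and endemicity maps (Figs 7 and 8) can be sketched as follows. This is a hedged illustration, not the DIVA-GIS procedure: the inverse-range weight (1 / number of predicted cells) is an assumption, since the text states only that restricted-range species received greater weight. The two-cell minimum and the 100 000-cell upper limit for endemicity follow the text and the Fig. 4 caption; all names and data are hypothetical.

```python
from collections import defaultdict


def richness_and_endemism(predicted, min_cells=2, max_cells=100_000):
    """Sum predicted-presence grids into richness and range-weighted endemism.

    predicted maps each species name to the set of grid cells where it is
    predicted present. Species below min_cells are skipped entirely; species
    above max_cells count toward richness but not endemism.
    """
    richness = defaultdict(int)
    endemism = defaultdict(float)
    for species, cells in predicted.items():
        n_cells = len(cells)
        if n_cells < min_cells:
            continue
        for cell in cells:
            richness[cell] += 1
        if n_cells <= max_cells:
            weight = 1.0 / n_cells  # assumed weighting: rarer species weigh more
            for cell in cells:
                endemism[cell] += weight
    return dict(richness), dict(endemism)


# Hypothetical toy predictions on a 2 x 2 grid, cells keyed by (row, column)
predicted = {
    "sp_a": {(0, 0), (0, 1), (1, 0), (1, 1)},  # widespread
    "sp_b": {(0, 0), (0, 1)},                  # restricted
    "sp_c": {(0, 0)},                          # single cell: excluded
}

richness, endemism = richness_and_endemism(predicted)
print(richness)
print(endemism)
```

Cells shared by the restricted species score higher for endemism than cells occupied only by the widespread one, even where richness differs by a single species.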

Another limiting factor is that mosquito taxonomy is still
largely at the alpha stage of species discovery and description
(Zavortink, 1990). Belkin et al. (1965) speculated that over
1000 species would eventually be described from Middle
America. Middle America included the southern U.S.A. and
only a part of Colombia, Ecuador, and Peru, so it does not
exactly coincide with the country coverage of the present
database. As of 22 September 2006, however, the SCC reported
852 different species from the countries covered by Belkin
et al. (1965), that is, the U.S.A. plus the 42 countries in the present
database, less Argentina, Bolivia, Brazil, Chile, Paraguay, and
Uruguay. The SCC reported 998 different species from the 42
countries covered by the present database, but this is far from
a complete species inventory for the region. The present data
set includes collection details for about half that number of
species. About half the records in the original MIMP database,
however, were not included in the data set presented here due
to a lack of a georeference or because of uncertain species
identification.

Faran et al. (1984) discussed the limitations of the MIMP
database, which included variable coverage (good for the Canal
Zone of Panama and poor for Amazonia and Patagonia), potential
bias in the type of habitat surveyed (towards ground pools),
and a lack of uniformity in collection method. Most records in
the MIMP database are larval collections and Fig. 2 indicates that
the five most commonly collected species were ground pool
breeders, which may indicate a collection bias towards ground
pools. Larvae of the sixth most commonly collected mosquito,
Limatus durhamii Theobald, however, are usually found in natural
container habitats, which suggests that this bias, if present,
is not absolute.

Many uses of primary species-occurrence data are possible
(Chapman, 2005c) and methods are available to ensure data
quality (e.g. Chapman, 2005a,b). The database could be
improved by including estimates of georeference error, e.g.
by the point-radius method (Wieczorek et al., 2004). Almost
9000 mosquito records in the original MIMP database lacked
a georeference but could be added using new software (e.g.
BioGeoMancer, http://www.biogeomancer.org/) that can parse
text location descriptions in batch mode and calculate
geographic coordinates. Most (38 of 42) countries had records that
lacked a georeference; no country bias could be discerned.
Inclusion of these records may increase the number of species
for each country but probably not substantially. For example,
the data for Argentina, the least representative for known
species (Table 1), would gain 22 records (4%) and two additional
species (a 5% increase). Over 7000 georeferenced records from
Egypt, Israel, and Kenya were present in the original database
but were not included in this study.

The addition to this database of other georeferenced
collection records from the Neotropics and other parts of the
world would advance the reliability of species richness
estimates and knowledge of global mosquito biogeography.
Thousands of mosquito collection records remain to be
digitised at the WRBU, and many more exist in other museums
and research institutes around the world. The availability of
mosquito collections and associated data may reduce the cost
of studying the transmission of many mosquito-borne
pathogens (Suarez & Tsutsui, 2004). Despite their value,
collection records are often not digitised, are not easily accessible,
and, as was learnt for the MIMP database, are vulnerable to loss.
Species occurrence data for many organisms, however, are
increasingly available electronically (Graham et al., 2004),
and the MIMP database and cleaned subset presented here will
be accessible through the Smithsonian Institution's online
collections database. An online database of worldwide
mosquito collection records would have many benefits. For
example, it would allow the high resolution prediction of the
potential distribution for each species, allow powerful
insights into mosquito community structure and ecological and
climatic correlates to species occurrence (ecological niche),
enable predictions about the potential spread of exotic
mosquito introductions, help identify cryptic evolutionary
lineages that differ in geographic or ecological space, allow the
location of biodiversity hotspots, and enable the
identification of under-sampled areas in need of further mosquito
collecting. An online repository of mosquito collection
information may also encourage the standardisation of collection
reporting, and the digitising and contribution of past
collection records. Suarez and Tsutsui (2004) concluded that the
rate at which collection records are entered into databases and
made accessible must be increased to benefit society. Mapping
tools and the Internet are now available to enrich the
information content and global reach of mosquito collection
databases by enabling online users to analyse records and visualise
results in spatially explicit ways.